
Abstract

To shed light on cancer development, it is crucial to understand which genes play a role in the mechanisms linked to this disease and, moreover, what that role is. Biological processes such as proliferation and apoptosis are commonly linked to cancer progression. Based on expression data, we perform functional enrichment analysis, gene regulatory network inference and upstream regulator analysis to score the importance of well-known biological processes with respect to the studied cancer. We then use these scores to predict two specific roles: genes that act as tumor suppressor genes (TSGs) and genes that act as oncogenes (OCGs). This methodology allows us not only to identify genes with a dual role (TSG in one cancer type and OCG in another) but also to elucidate the underlying biological processes.

Introduction

Cancer development is influenced by mutations in two distinct categories of genes, known as tumor suppressor genes (TSGs) and oncogenes (OCGs). Mutations in genes of the first category lead to faster cell proliferation, while mutations in genes of the second category increase or change their function. We propose MoonlightR, a new approach to define TSGs and OCGs based on functional enrichment analysis, gene regulatory network inference and upstream regulator analysis, which together score the importance of well-known biological processes with respect to the studied cancer.

Moonlight’s pipeline

The figure below shows Moonlight’s pipeline:

[Figure: Moonlight’s pipeline]

Moonlight’s proposed workflow

The proposed pipeline consists of the following eight steps:

  1. getDataTCGA, data collection: expression levels of genes in all samples obtained with IlluminaHiSeq RNASeqV2 in 18 normal tissues (NT) and 18 cancer tissues (CT) according to TCGA criteria, and a GEO data set matched to each of the 18 given TCGA cancer types, as described in the TCGA / GEO table below.
  2. DPA, Differential Phenotype Analysis: identifies genes or probes that differ significantly between two phenotypes, such as normal vs. tumor, normal vs. stage I, or normal vs. molecular subtype.
  3. FEA, Functional Enrichment Analysis: uses Fisher’s test to identify gene sets with biological functions linked to cancer (see footnote 1) that are significantly enriched by regulated genes (RG, defined in step 4).
  4. GRN, Gene Regulatory Network: inferred between each single DEG (sDEG) and all genes by means of mutual information, yielding for each DEG a list of regulated genes (RG).
  5. URA, Upstream Regulator Analysis: for the DEGs in each enriched gene set, we compute a z-score, defined as the ratio between the sum of the predicted effects of all genes involved in the specific function and the square root of the number of all genes.
  6. PRA, Pattern Recognition Analysis: identifies candidate TSGs (down) and OCGs (up), using either user-defined biological processes or random forests.
  7. We applied the above procedure to multiple cancer types to obtain cancer-specific lists of TSGs and OCGs. We compared the lists across cancers: if an sDEG was a TSG in one cancer and an OCG in another, we defined it as a dual-role TSG-OCG. If an sDEG was defined as an OCG or a TSG in only one tissue, we defined it as a tissue-specific biomarker.
  8. We use the COSMIC database to define a gold-standard list of TSGs and OCGs to assess the accuracy of the proposed method.

1 For the devel version of MoonlightR we use a short extract of 100 biological functions from QIAGEN’s Ingenuity Pathway Analysis (IPA). We are still working to integrate the package.

Installation

To install use the code below.

source("https://bioconductor.org/biocLite.R")
biocLite("MoonlightR")

Citation

Please cite the TCGAbiolinks package: Colaprico et al. 2016 (see References).

Related publications to this package: Silva et al. 2016 (see References).

Download: Get TCGA data

You can search TCGA data using the getDataTCGA function.

getDataTCGA: Searching by cancer type and data type [Gene Expression]

The user can query and download data for the cancer types supported by TCGA using the function getDataTCGA. This example uses LUAD gene expression data, restricted to 10 samples to reduce download time.

dataFilt <- getDataTCGA(cancerType = "LUAD", 
                          dataType = "Gene expression",
                          directory = "data",
                          nSample = 10)

getDataTCGA: Searching by cancer type and data type [Methylation]

The user can also query and download methylation data using the function getDataTCGA:

dataFilt <- getDataTCGA(cancerType = "BRCA",
                        dataType = "Methylation",
                        directory = "data",
                        nSample = 5)

Download: Get GEO data

You can search GEO data using the getDataGEO function.

GEO_TCGAtab is an 18x12 matrix that lists the GEO data set matched to each of the 18 given TCGA cancer types.

knitr::kable(GEO_TCGAtab, digits = 2,
             caption = "Table with GEO data set matched to one of the 18 given TCGA cancer types",
             row.names = TRUE)
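
|    | Cancer | TP   | NT  | DEG. | Dataset  | TP.1 | NT.1 | Platform | DEG.. | Common | GEO_Normal      | GEO_Tumor               |
|----|--------|------|-----|------|----------|------|------|----------|-------|--------|-----------------|-------------------------|
| 1  | BLCA   | 408  | 19  | 2937 | GSE13507 | 165  | 10   | GPL65000 | 2099  | 896    | control         | cancer                  |
| 2  | BRCA   | 1097 | 114 | 3390 | GSE39004 | 61   | 47   | GPL6244  | 2449  | 1248   | normal          | Tumor                   |
| 3  | CHOL   | 36   | 9   | 5015 | GSE26566 | 104  | 59   | GPL6104  | 3983  | 2587   | Surrounding     | Tumor                   |
| 4  | COAD   | 286  | 41  | 3788 | GSE41657 | 25   | 12   | GPL6480  | 3523  | 1367   | N               | A                       |
| 5  | ESCA   | 184  | 11  | 2525 | GSE20347 | 17   | 17   | GPL571   | 1316  | 406    | normal          | carcinoma               |
| 6  | GBM    | 156  | 5   | 4828 | GSE50161 | 34   | 13   | GPL570   | 4504  | 2660   | normal          | GBM                     |
| 7  | HNSC   | 520  | 44  | 2973 | GSE6631  | 22   | 22   | GPL8300  | 142   | 129    | normal          | cancer                  |
| 8  | KICH   | 66   | 25  | 4355 | GSE15641 | 6    | 23   | GPL96    | 1789  | 680    | normal          | chromophobe             |
| 9  | KIRC   | 533  | 72  | 3618 | GSE15641 | 32   | 23   | GPL96    | 2911  | 939    | normal          | clear cell RCC          |
| 10 | KIRP   | 290  | 32  | 3748 | GSE15641 | 11   | 23   | GPL96    | 2020  | 756    | normal          | papillary RCC           |
| 11 | LIHC   | 371  | 50  | 3043 | GSE45267 | 46   | 41   | GPL570   | 1583  | 860    | normal liver    | HCC sample              |
| 12 | LUAD   | 515  | 59  | 3498 | GSE10072 | 58   | 49   | GPL96    | 666   | 555    | normal          | tumor                   |
| 13 | LUSC   | 503  | 51  | 4984 | GSE33479 | 14   | 27   | GPL6480  | 3729  | 1706   | normal          | squamous cell carcinoma |
| 14 | PRAD   | 497  | 52  | 1860 | GSE6919  | 81   | 90   | GPL8300  | 246   | 149    | normal prostate | tumor samples           |
| 15 | READ   | 94   | 10  | 3628 | GSE20842 | 65   | 65   | GPL4133  | 2172  | 1261   | M               | T                       |
| 16 | STAD   | 415  | 35  | 2622 | GSE2685  | 10   | 10   | GPL80    | 487   | 164    | N               | T                       |
| 17 | THCA   | 505  | 59  | 1994 | GSE33630 | 60   | 45   | GPL570   | 1451  | 781    | N               | T                       |
| 18 | UCEC   | 176  | 24  | 4183 | GSE17025 |      |      | GPL570   |       |        | tp              | lcm                     |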

getDataGEO: Searching by cancer type and data type [Gene Expression]

The user can query and download the cancer types supported by GEO, using the function getDataGEO:

dataFilt <- getDataGEO(GEOobject = "GSE20347",platform = "GPL571")
dataFilt <- getDataGEO(TCGAtumor = "ESCA")

Analysis: To analyze TCGA data

DPA: Differential Phenotype Analysis

Differential Phenotype analysis is able to identify genes or probes that are significantly different between two phenotypes such as normal vs. tumor, or normal vs. stageI, normal vs. molecular subtype.

For gene expression data, DPA runs a differential expression analysis (DEA) to identify differentially expressed genes (DEGs) using the TCGAanalyze_DEA function from TCGAbiolinks.

For methylation data, DPA runs a differentially methylated regions analysis (DMR) to identify differentially methylated CpG sites using the TCGAanalyze_DMR function from TCGAbiolinks.

dataDEGs <- DPA(dataFilt = dataFilt,
                dataType = "Gene expression")

For GEO gene expression data, DPA runs a differential expression analysis (DEA) to identify differentially expressed genes (DEGs) using the eBayes and topTable functions from limma.

DataAnalysisGEO<- "../GEO_dataset/"
i<-5

cancer <- GEO_TCGAtab$Cancer[i]
cancerGEO <- GEO_TCGAtab$Dataset[i]
cancerPLT <-GEO_TCGAtab$Platform[i]
fileCancerGEO <- paste0(cancer,"_GEO_",cancerGEO,"_",cancerPLT, ".RData")

dataFilt <- getDataGEO(TCGAtumor = cancer)

GEOdegs <- DPA(dataConsortium = "GEO",
               gset = dataFilt ,
               colDescription = "title",
               samplesType  = c(GEO_TCGAtab$GEO_Normal[i],
                                GEO_TCGAtab$GEO_Tumor[i]),
               fdr.cut = 0.01,
               logFC.cut = 1,
               gsetFile = paste0(DataAnalysisGEO,fileCancerGEO))

We can visualize those differentially expressed genes (DEGs) with a volcano plot using the TCGAVisualize_volcano function from TCGAbiolinks.

library(TCGAbiolinks)
TCGAVisualize_volcano(dataDEGs$logFC, dataDEGs$FDR,
                      filename = "DEGs_volcano.png",
                      x.cut = 7,
                      y.cut = 10^-5,
                      names = rownames(dataDEGs),
                      color = c("black","red","dodgerblue3"),
                      names.size = 2,
                      xlab = " Gene expression fold change (Log2)",
                      legend = "State",
                      title = "Volcano plot (Normal NT vs Tumor TP)",
                      width = 10)

The figure produced by the code above is shown below:

[Figure: volcano plot (Normal NT vs Tumor TP)]
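
The fdr.cut and logFC.cut arguments passed to DPA in the GEO example above act as thresholds on a table of test results. The following is a minimal sketch of such a filter on a toy table. The gene names, the values and the helper filterDEGs are made up for illustration and are not part of MoonlightR; applying the cut-off to the absolute log fold change is an assumption.

```r
# Toy differential-expression table (hypothetical values).
toyDEA <- data.frame(
  logFC = c(2.3, -1.8, 0.4, -0.2, 1.1),
  FDR   = c(0.001, 0.004, 0.0005, 0.30, 0.05),
  row.names = c("geneA", "geneB", "geneC", "geneD", "geneE")
)

# Hypothetical helper: keep genes that pass both thresholds.
filterDEGs <- function(tab, fdr.cut = 0.01, logFC.cut = 1) {
  keep <- tab$FDR < fdr.cut & abs(tab$logFC) >= logFC.cut
  tab[keep, , drop = FALSE]
}

toyDEGs <- filterDEGs(toyDEA, fdr.cut = 0.01, logFC.cut = 1)
rownames(toyDEGs)
```

Only genes that are both significant and strongly changed survive the filter, which is why tightening either threshold shortens the DEG list passed to the later steps.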

LPA: Literature Phenotype Analysis

The user can perform a literature phenotype analysis using the function LPA.

data(DEGsmatrix)

BPselected <- c("apoptosis", "proliferation of cells")
DiseaseListNew <- list()

# Run LPA on the first 50 DEGs for each selected biological process
for (bp in BPselected) {
  BPannotations <- DiseaseList[[bp]]$ID
  dataLPA <- LPA(dataDEGs = DEGsmatrix[1:50, ],
                 BP = bp,
                 BPlist = BPannotations)
  # Store the result under the name of the biological process
  DiseaseListNew[[bp]] <- dataLPA
}
## Warning in Medline(Result, query): NAs introduced by coercion


FEA: Functional Enrichment Analysis

The user can perform a functional enrichment analysis using the function FEA. For each DEG in the gene set, a z-score is calculated; this score indicates how the genes act in the gene set.

dataFEA <- FEA(DEGsmatrix = DEGsmatrix)

The output can be visualized with a FEA plot.
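
Step 3 of the workflow names Fisher’s test as the basis of the enrichment analysis. The sketch below runs a one-sided Fisher’s exact test on a 2x2 table with base R. The counts are invented and the table layout is an assumption made for illustration; it is not MoonlightR’s internal code.

```r
# Hypothetical counts for one biological process.
nUniverse <- 20000  # all genes considered
nInSet    <- 200    # genes annotated to the process
nDEG      <- 500    # differentially expressed genes
nOverlap  <- 20     # DEGs annotated to the process

# 2x2 contingency table: DEG status (rows) vs set membership (columns).
tab <- matrix(c(nOverlap,          nDEG - nOverlap,
                nInSet - nOverlap, nUniverse - nDEG - nInSet + nOverlap),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("DEG", "notDEG"), c("inSet", "notInSet")))

# One-sided test: is the overlap larger than expected by chance?
res <- fisher.test(tab, alternative = "greater")
res$p.value
```

With these counts, about 5 DEGs would be expected in the set by chance (500 x 200 / 20000), so an overlap of 20 yields a small p-value.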

FEAplot: Functional Enrichment Analysis Plot

The user can plot the result of a functional enrichment analysis using the function plotFEA. A negative z-score indicates that the process’ activity is decreased. A positive z-score indicates that the process’ activity is increased.

plotFEA(dataFEA = dataFEA, plotNAME = "FEAplot", height = 20, width = 10)

The figure generated by the above code is shown below:

[Figure: FEA plot]

GRN: Gene Regulatory Network

The user can perform a gene regulatory network analysis using the function GRN which infers the network using the parmigene package.

dataGRN <- GRN(TFs = rownames(DEGsmatrix)[1:10], normCounts = dataFilt,
                   nGenesPerm = 10,kNearest = 3,nBoot = 10)
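
The network is inferred from mutual information between a candidate regulator and the other genes. The sketch below estimates mutual information with a simple histogram (binning) estimator in base R. This is a swapped-in technique chosen to keep the example self-contained; parmigene uses a k-nearest-neighbour estimator instead. The helper binnedMI and the simulated expression values are hypothetical.

```r
# Hypothetical helper: mutual information (in nats) from binned values.
binnedMI <- function(x, y, nBins = 5) {
  bx  <- cut(x, breaks = nBins)
  by  <- cut(y, breaks = nBins)
  pxy <- table(bx, by) / length(x)  # joint distribution
  px  <- rowSums(pxy)               # marginal of x
  py  <- colSums(pxy)               # marginal of y
  nz  <- pxy > 0                    # skip empty cells (0 * log 0 = 0)
  sum(pxy[nz] * log(pxy[nz] / outer(px, py)[nz]))
}

set.seed(1)
regulator <- rnorm(200)
target    <- 2 * regulator + rnorm(200, sd = 0.3)  # depends on the regulator
unrelated <- rnorm(200)                            # independent gene

miTarget    <- binnedMI(regulator, target)
miUnrelated <- binnedMI(regulator, unrelated)
```

A gene that depends on the regulator receives a clearly higher score than an independent one, which is the property a mutual-information network relies on when it decides which edges to keep.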

URA: Upstream Regulator Analysis

The user can perform upstream regulator analysis using the function URA. This function is applied to each DEG in the enriched gene set and its neighbors in the GRN.

data(dataGRN)
data(DEGsmatrix)
dataURA <- URA(dataGRN = dataGRN,
               DEGsmatrix = DEGsmatrix,
               BPname = NULL)
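
Step 5 of the workflow describes the z-score as the sum of the predicted effects of the genes involved in a function divided by the square root of the number of genes. The sketch below reproduces that arithmetic, assuming each effect is coded as +1 (consistent with an increase) or -1 (consistent with a decrease). This coding and the helper name uraZscore are assumptions made for illustration.

```r
# Hypothetical helper: z-score as sum of effects over sqrt(number of genes).
uraZscore <- function(effects) {
  sum(effects) / sqrt(length(effects))
}

# Nine regulated genes: seven point to an increase, two to a decrease.
effects <- c(1, 1, 1, -1, 1, 1, -1, 1, 1)
z <- uraZscore(effects)
z
```

Here the sum of the effects is 5 and the number of genes is 9, so the score is 5 / 3. A set whose genes all point in the same direction reaches the largest magnitude, the square root of its size.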

PRA: Pattern Recognition Analysis

The user can retrieve TSG/OCG candidates using either selected biological processes or a random forest classifier trained on known COSMIC OCGs/TSGs.

data(dataURA)
dataDual <- PRA(dataURA = dataURA,
                          BPname = c("apoptosis","proliferation of cells"),
                          thres.role = 0)

The figure generated by the above code is shown below:

[Figure: PRA result]
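
The following sketch illustrates how selected biological processes and thres.role could translate into candidate roles. It assumes that a regulator which increases apoptosis and decreases proliferation is a TSG candidate, and that the reverse pattern marks an OCG candidate. This rule, the helper callRoles and the z-scores are illustrative assumptions, not the package’s implementation.

```r
# Toy z-scores per regulator and biological process (hypothetical values).
toyURA <- matrix(c( 2.5, -3.0,
                   -1.9,  2.2,
                    0.4,  0.1,
                    1.2,  1.5),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(c("geneA", "geneB", "geneC", "geneD"),
                                 c("apoptosis", "proliferation of cells")))

# Hypothetical helper: assign a role when both processes pass the threshold
# in opposite directions; all other genes stay unassigned (NA).
callRoles <- function(ura, thres.role = 0) {
  apo  <- ura[, "apoptosis"]
  prol <- ura[, "proliferation of cells"]
  role <- rep(NA_character_, nrow(ura))
  role[apo >  thres.role & prol < -thres.role] <- "TSG"
  role[apo < -thres.role & prol >  thres.role] <- "OCG"
  setNames(role, rownames(ura))
}

roles <- callRoles(toyURA, thres.role = 1)
roles
```

Under this assumed rule, a larger thres.role demands stronger z-scores in both processes and therefore returns fewer candidates.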

Moonlight Replication in Cosmic cancer genes:

Here we run Moonlight on a list of already validated TSGs and OCGs.

CosmicGenes <- c(knownDriverGenes$OCG, knownDriverGenes$TSG)
   
dataFilt <- getDataTCGA(cancerType = "BRCA", 
                          dataType = "Gene expression",
                          directory = "data",
                          nSample = 10)

dataDEGs <- DPA(dataFilt = dataFilt,
                dataType = "Gene expression")

dataFEA <- FEA(DEGsmatrix = dataDEGs)

dataGRN <- GRN(TFs = CosmicGenes, 
               DEGsmatrix = dataDEGs,
               DiffGenes = TRUE,
               normCounts = dataFilt)

dataURA <-URA(dataGRN = dataGRN, 
              DEGsmatrix = dataDEGs, 
              BPname = c("apoptosis",
                         "proliferation of cells"))

dataDual <-PRA(dataURA = dataURA, 
               BPname = c("apoptosis",
                          "proliferation of cells"),
               thres.role = 1)
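
Step 8 of the workflow uses known driver genes as a gold standard. One simple way to summarize the agreement is precision and recall on the overlap between predicted and known genes, sketched below. The gene lists are placeholders and not results of the pipeline; the choice of these two measures is an assumption.

```r
# Placeholder gene lists (hypothetical).
knownTSG     <- c("geneA", "geneB", "geneC", "geneD")
predictedTSG <- c("geneA", "geneC", "geneX")

# Predicted genes that are also in the gold standard.
truePos <- intersect(predictedTSG, knownTSG)

precision <- length(truePos) / length(predictedTSG)  # correct among predicted
recall    <- length(truePos) / length(knownTSG)      # recovered among known
c(precision = precision, recall = recall)
```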

plotNetworkHive: GRN hive visualization taking into account Cosmic cancer genes

In the following plot the nodes are separated into three groups: known tumor suppressor genes (yellow), known oncogenes (green) and the rest (gray).

data(knownDriverGenes)
data(dataGRN)
plotNetworkHive(dataGRN, knownDriverGenes, 0.55)

[Figure: GRN hive plot]

TCGA Downstream Analysis: Case Studies

Introduction

This vignette shows a complete workflow of the ‘MoonlightR’ package. The code is divided into the following case studies:

Case study n. 1: Downstream analysis LUAD

dataFilt <- getDataTCGA(cancerType = "LUAD", 
                          dataType = "Gene expression",
                          directory = "data",
                          nSample = 10)

dataDEGs <- DPA(dataFilt = dataFilt,
                dataType = "Gene expression")

dataFEA <- FEA(DEGsmatrix = dataDEGs)

dataGRN <- GRN(TFs = rownames(dataDEGs)[1:1000], 
               DEGsmatrix = dataDEGs,
               DiffGenes = TRUE,
               normCounts = dataFilt)
dataURA <-URA(dataGRN = dataGRN, 
              DEGsmatrix = dataDEGs, 
              BPname = c("apoptosis",
                         "proliferation of cells"))

dataDual <-PRA(dataURA = dataURA, 
               BPname = c("apoptosis",
                          "proliferation of cells"),
               thres.role = 1)

CancerGenes <- list("TSG"=names(dataDual$TSG), "OCG"=names(dataDual$OCG))

plotURA: Upstream regulatory analysis plot

The user can plot upstream regulatory analysis using the function plotURA.

plotURA(dataURA = dataURA[c(names(dataDual$TSG), names(dataDual$OCG)), ],
        plotNAME = "URAplot")

The figure produced by the code above is shown below:

[Figure: URA plot]

Case study n. 2: Expression pipeline Pan Cancer 5 cancer types

cancerList <- c("BLCA","COAD","ESCA","HNSC","STAD")

listMoonlight <- moonlight(cancerType = cancerList, 
                      dataType = "Gene expression",
                      directory = "data",
                      nSample = 10,
                      nTF = 100,
                      DiffGenes = TRUE,
                      BPname = c("apoptosis","proliferation of cells"))
save(listMoonlight, file = "listMoonlight_ncancer5.Rdata")

plotCircos: Moonlight Circos Plot

The results of the moonlight pipeline can be visualized with a circos plot. The outer ring is colored by cancer type and the inner ring shows OCGs and TSGs. Inner connections are green for common OCGs, yellow for common TSGs and red for a possible dual role.

plotCircos(listMoonlight = listMoonlight, additionalFilename = "_ncancer5")

The figure produced by the code above is shown below:

[Figure: Moonlight circos plot for 5 cancer types]
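
The comparison across cancer types in step 7 of the workflow can be sketched with plain set operations. The nested list below is a toy stand-in with made-up gene names; its layout is an assumption chosen for illustration and may differ from the object returned by moonlight.

```r
# Toy per-cancer predictions (hypothetical gene names).
toyResults <- list(
  BLCA = list(TSG = c("geneA", "geneB"), OCG = c("geneC")),
  COAD = list(TSG = c("geneC", "geneB"), OCG = c("geneD")),
  ESCA = list(TSG = c("geneE"),          OCG = c("geneA"))
)

# Genes called TSG or OCG in at least one cancer type.
allTSG <- unique(unlist(lapply(toyResults, `[[`, "TSG")))
allOCG <- unique(unlist(lapply(toyResults, `[[`, "OCG")))

# Dual role: TSG in one cancer type and OCG in another.
dualRole <- intersect(allTSG, allOCG)

# Tissue-specific: a role assigned in exactly one cancer type.
calls  <- unlist(lapply(toyResults, function(x) unique(c(x$TSG, x$OCG))))
counts <- table(calls)
tissueSpecific <- names(counts)[as.vector(counts) == 1]
```

In the toy data, geneA and geneC switch roles between cancer types, which corresponds to the red connections described for the circos plot, while geneD and geneE appear in a single cancer type only.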

Case study n. 3: Downstream analysis BRCA with stages

library(MoonlightR)
library(TCGAbiolinks)

listMoonlight <- NULL
for (i in 1:4){
    dataDual <- moonlight(cancerType = "BRCA", 
                      dataType = "Gene expression",
                      directory = "data",
                      nSample = 10,
                      nTF = 100,
                      DiffGenes = TRUE,
                      BPname = c("apoptosis","proliferation of cells"),
                      stage = i)
    listMoonlight <- c(listMoonlight, list(dataDual))
    save(dataDual, file = paste0("dataDual_stage",as.roman(i), ".Rdata"))
}
names(listMoonlight) <- c("stage1", "stage2", "stage3", "stage4")

# Prepare mutation data for each stage
mutation <- GDCquery_Maf(tumor = "BRCA")
dataClin <- GDCquery_clinic(project = "TCGA-BRCA", type = "clinical_patient")

# Normalize stage labels, e.g. "stage iia" becomes "Stage II"
dataClin$tumor_stage <- toupper(dataClin$tumor_stage)
dataClin$tumor_stage <- gsub("[ABCDEFGH]", "", dataClin$tumor_stage)
dataClin$tumor_stage <- gsub("ST", "Stage", dataClin$tumor_stage)

res.mutation <- NULL
for (stage in 1:4) {
  curStage <- paste0("Stage ", as.roman(stage))
  dataStg <- dataClin[dataClin$tumor_stage %in% curStage, ]
  message(paste(curStage, "with", nrow(dataStg), "samples"))

  # Keep mutations from patients of the current stage
  # (the first 12 characters of a sample barcode are matched to the patient barcode)
  dataSmTP <- mutation$Tumor_Sample_Barcode
  dataSmTP <- dataSmTP[substr(dataSmTP, 1, 12) %in% dataStg$bcr_patient_barcode]
  info.mutation <- mutation[mutation$Tumor_Sample_Barcode %in% dataSmTP, ]

  # Retain in-frame deletions, in-frame insertions and missense variants
  ind  <- which(info.mutation[, "Consequence"] == "inframe_deletion")
  ind2 <- which(info.mutation[, "Consequence"] == "inframe_insertion")
  ind3 <- which(info.mutation[, "Consequence"] == "missense_variant")
  res.mutation <- c(res.mutation,
                    list(info.mutation[c(ind, ind2, ind3), c(1, 51)]))
}
names(res.mutation) <- c("stage1", "stage2", "stage3", "stage4")

# First element of each stage result, named by stage
tmp <- lapply(listMoonlight, function(x) x[[1]])

plotCircos(listMoonlight = listMoonlight,
           listMutation = res.mutation,
           additionalFilename = "proc2_wmutation",
           intensityColDual = 0.2,
           fontSize = 2)

The results of the moonlight pipeline can be visualized with a circos plot. The outer ring is colored by cancer type and the inner ring shows OCGs and TSGs. Inner connections are green for common OCGs, yellow for common TSGs and red for a possible dual role.

The figure generated by the code above is shown below:

[Figure: Moonlight circos plot for BRCA stages with mutation data]

Moonlight working with molecular subtypes data.

This section shows a downstream analysis that replicates Moonlight on TCGA data by comparing different molecular subtypes with normal samples. You can use the function TCGAquery_subtype from TCGAbiolinks to retrieve this information.

The Cancer Genome Atlas (TCGA) Research Network has reported integrated genome-wide studies of various diseases. We have added some of the subtypes defined by these reports to our package. Subtype data have been added for the following tumors: ACC (Cancer Genome Atlas Research Network et al. 2016), BRCA (Cancer Genome Atlas Research Network et al. 2012c), COAD (Cancer Genome Atlas Research Network et al. 2012b), GBM (Ceccarelli et al. 2016), HNSC (Cancer Genome Atlas Research Network et al. 2015a), KICH (Davis et al. 2014), KIRC (Cancer Genome Atlas Research Network et al. 2013a), KIRP (Linehan et al. 2016), LGG (Ceccarelli et al. 2016), LUAD (Cancer Genome Atlas Research Network et al. 2014b), LUSC (Cancer Genome Atlas Research Network et al. 2012a), PRAD (Cancer Genome Atlas Research Network et al. 2015c), READ (Cancer Genome Atlas Research Network et al. 2012b), SKCM (Cancer Genome Atlas Research Network et al. 2015b), STAD (Cancer Genome Atlas Research Network et al. 2014a), THCA (Cancer Genome Atlas Research Network et al. 2014c) and UCEC (Cancer Genome Atlas Research Network et al. 2013b).

A subset of the LGG subtype table is shown below:

## Subtype information from:doi:10.1016/j.cell.2015.12.028

|    | patient      | Tissue.source.site          | Study                    | BCR |
|----|--------------|-----------------------------|--------------------------|-----|
| 1  | TCGA-CS-4938 | Thomas Jefferson University | Brain Lower Grade Glioma | IGC |
| 2  | TCGA-CS-4941 | Thomas Jefferson University | Brain Lower Grade Glioma | IGC |
| 3  | TCGA-CS-4942 | Thomas Jefferson University | Brain Lower Grade Glioma | IGC |
| 4  | TCGA-CS-4943 | Thomas Jefferson University | Brain Lower Grade Glioma | IGC |
| 5  | TCGA-CS-4944 | Thomas Jefferson University | Brain Lower Grade Glioma | IGC |
| 6  | TCGA-CS-5390 | Thomas Jefferson University | Brain Lower Grade Glioma | IGC |
| 7  | TCGA-CS-5393 | Thomas Jefferson University | Brain Lower Grade Glioma | IGC |
| 8  | TCGA-CS-5394 | Thomas Jefferson University | Brain Lower Grade Glioma | IGC |
| 9  | TCGA-CS-5395 | Thomas Jefferson University | Brain Lower Grade Glioma | IGC |
| 10 | TCGA-CS-5396 | Thomas Jefferson University | Brain Lower Grade Glioma | IGC |

Moonlight working with mutation data.

This section shows a downstream analysis that combines Moonlight with TCGA mutation data. You can use the function GDCquery_Maf from TCGAbiolinks to retrieve this information.


Session Information

sessionInfo()
## R version 3.3.1 (2016-06-21)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: OS X 10.9.5 (Mavericks)
## 
## locale:
## [1] C/UTF-8/C/C/C/C
## 
## attached base packages:
## [1] grid      parallel  stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
## [1] TCGAbiolinks_2.1.13 png_0.1-7           MoonlightR_0.99.1  
## [4] doParallel_1.0.10   iterators_1.0.8     foreach_1.4.3      
## [7] knitr_1.14         
## 
## loaded via a namespace (and not attached):
##   [1] circlize_0.3.9                         
##   [2] fastmatch_1.0-4                        
##   [3] aroma.light_3.3.2                      
##   [4] plyr_1.8.4                             
##   [5] igraph_1.0.1                           
##   [6] ConsensusClusterPlus_1.37.0            
##   [7] splines_3.3.1                          
##   [8] BiocParallel_1.7.9                     
##   [9] GenomeInfoDb_1.9.14                    
##  [10] ggplot2_2.1.0                          
##  [11] TH.data_1.0-7                          
##  [12] digest_0.6.10                          
##  [13] BiocInstaller_1.23.9                   
##  [14] GOSemSim_1.99.4                        
##  [15] GO.db_3.4.0                            
##  [16] gdata_2.17.0                           
##  [17] magrittr_1.5                           
##  [18] memoise_1.0.0                          
##  [19] cluster_2.0.5                          
##  [20] limma_3.29.21                          
##  [21] ComplexHeatmap_1.11.7                  
##  [22] Biostrings_2.41.4                      
##  [23] readr_1.0.0                            
##  [24] annotate_1.51.1                        
##  [25] matrixStats_0.51.0                     
##  [26] R.utils_2.4.0                          
##  [27] sandwich_2.3-4                         
##  [28] jpeg_0.1-8                             
##  [29] colorspace_1.2-6                       
##  [30] rvest_0.3.2                            
##  [31] ggrepel_0.5                            
##  [32] dplyr_0.5.0                            
##  [33] jsonlite_1.1                           
##  [34] hexbin_1.27.1                          
##  [35] RCurl_1.95-4.8                         
##  [36] graph_1.51.0                           
##  [37] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
##  [38] roxygen2_5.0.1                         
##  [39] supraHex_1.11.2                        
##  [40] genefilter_1.55.2                      
##  [41] GEOquery_2.39.4                        
##  [42] ape_3.5                                
##  [43] survival_2.39-5                        
##  [44] zoo_1.7-13                             
##  [45] gtable_0.2.0                           
##  [46] zlibbioc_1.19.0                        
##  [47] XVector_0.13.7                         
##  [48] GetoptLong_0.1.5                       
##  [49] kernlab_0.9-25                         
##  [50] Rgraphviz_2.17.0                       
##  [51] shape_1.4.2                            
##  [52] prabclus_2.2-6                         
##  [53] BiocGenerics_0.19.2                    
##  [54] DEoptimR_1.0-6                         
##  [55] scales_0.4.0                           
##  [56] DOSE_2.99.0                            
##  [57] HiveR_0.2.55                           
##  [58] DESeq_1.25.0                           
##  [59] mvtnorm_1.0-5                          
##  [60] edgeR_3.15.6                           
##  [61] DBI_0.5-1                              
##  [62] GGally_1.2.0                           
##  [63] ggthemes_3.2.0                         
##  [64] Rcpp_0.12.7                            
##  [65] xtable_1.8-2                           
##  [66] matlab_1.0.2                           
##  [67] mclust_5.2                             
##  [68] preprocessCore_1.35.0                  
##  [69] stats4_3.3.1                           
##  [70] httr_1.2.1                             
##  [71] fgsea_0.99.7                           
##  [72] gplots_3.0.1                           
##  [73] RColorBrewer_1.1-2                     
##  [74] fpc_2.1-10                             
##  [75] modeltools_0.2-21                      
##  [76] reshape_0.8.5                          
##  [77] XML_3.98-1.4                           
##  [78] R.methodsS3_1.7.1                      
##  [79] flexmix_2.3-13                         
##  [80] nnet_7.3-12                            
##  [81] RISmed_2.1.5                           
##  [82] locfit_1.5-9.1                         
##  [83] labeling_0.3                           
##  [84] reshape2_1.4.1                         
##  [85] AnnotationDbi_1.35.4                   
##  [86] munsell_0.4.3                          
##  [87] tools_3.3.1                            
##  [88] downloader_0.4                         
##  [89] RSQLite_1.0.0                          
##  [90] devtools_1.12.0                        
##  [91] evaluate_0.9                           
##  [92] stringr_1.1.0                          
##  [93] robustbase_0.92-6                      
##  [94] caTools_1.17.1                         
##  [95] randomForest_4.6-12                    
##  [96] dendextend_1.3.0                       
##  [97] coin_1.1-2                             
##  [98] nlme_3.1-128                           
##  [99] EDASeq_2.7.2                           
## [100] whisker_0.3-2                          
## [101] formatR_1.4                            
## [102] R.oo_1.20.0                            
## [103] xml2_1.0.0                             
## [104] DO.db_2.9                              
## [105] biomaRt_2.29.2                         
## [106] compiler_3.3.1                         
## [107] affyio_1.43.0                          
## [108] tibble_1.2                             
## [109] geneplotter_1.51.0                     
## [110] stringi_1.1.2                          
## [111] highr_0.6                              
## [112] GenomicFeatures_1.25.20                
## [113] lattice_0.20-34                        
## [114] trimcluster_0.1-2                      
## [115] Matrix_1.2-7.1                         
## [116] GlobalOptions_0.0.10                   
## [117] data.table_1.9.6                       
## [118] bitops_1.0-6                           
## [119] parmigene_1.0.2                        
## [120] dnet_1.0.9                             
## [121] rtracklayer_1.33.12                    
## [122] GenomicRanges_1.25.94                  
## [123] qvalue_2.5.2                           
## [124] R6_2.2.0                               
## [125] latticeExtra_0.6-28                    
## [126] affy_1.51.1                            
## [127] hwriter_1.3.2                          
## [128] ShortRead_1.31.1                       
## [129] KernSmooth_2.23-15                     
## [130] gridExtra_2.2.1                        
## [131] IRanges_2.7.17                         
## [132] codetools_0.2-15                       
## [133] MASS_7.3-45                            
## [134] gtools_3.5.0                           
## [135] assertthat_0.1                         
## [136] chron_2.3-47                           
## [137] SummarizedExperiment_1.3.82            
## [138] rjson_0.2.15                           
## [139] withr_1.0.2                            
## [140] GenomicAlignments_1.9.6                
## [141] Rsamtools_1.25.2                       
## [142] multcomp_1.4-6                         
## [143] S4Vectors_0.11.19                      
## [144] diptest_0.75-7                         
## [145] clusterProfiler_3.1.8                  
## [146] tidyr_0.6.0                            
## [147] class_7.3-14                           
## [148] Biobase_2.33.4

References

Cancer Genome Atlas Research Network et al. 2012a. “Comprehensive Genomic Characterization of Squamous Cell Lung Cancers.” doi:10.1038/nature11404.

———. 2012b. “Comprehensive Molecular Characterization of Human Colon and Rectal Cancer.” doi:10.1038/nature11252.

———. 2012c. “Comprehensive Molecular Portraits of Human Breast Tumours.” doi:10.1038/nature11412.

———. 2013a. “Comprehensive Molecular Characterization of Clear Cell Renal Cell Carcinoma.” doi:10.1038/nature12222.

———. 2013b. “Integrated Genomic Characterization of Endometrial Carcinoma.” doi:10.1038/nature12113.

———. 2014a. “Comprehensive Molecular Characterization of Gastric Adenocarcinoma.” doi:10.1038/nature13480.

———. 2014b. “Comprehensive Molecular Profiling of Lung Adenocarcinoma.” doi:10.1038/nature13385.

———. 2014c. “Integrated Genomic Characterization of Papillary Thyroid Carcinoma.” doi:10.1016/j.cell.2014.09.050.

———. 2015a. “Comprehensive Genomic Characterization of Head and Neck Squamous Cell Carcinomas.” doi:10.1038/nature14129.

———. 2015b. “Genomic Classification of Cutaneous Melanoma.” doi:10.1016/j.cell.2015.05.044.

———. 2015c. “The Molecular Taxonomy of Primary Prostate Cancer.” doi:10.1016/j.cell.2015.10.025.

———. 2016. “Comprehensive Pan-Genomic Characterization of Adrenocortical Carcinoma.” doi:10.1016/j.ccell.2016.04.002.

Ceccarelli M, Barthel FP, Malta TM, Sabedot TS, Salama SR, Murray BA, Morozova O, Newton Y, Radenbaugh A, Pagnotta SM, et al. 2016. “Molecular Profiling Reveals Biologically Discrete Subsets and Pathways of Progression in Diffuse Glioma.” doi:10.1016/j.cell.2015.12.028.

Colaprico A, Silva TC, Olsen C, Garofano L, Cava C, Garolini D, Sabedot TS, Malta TM, Pagnotta SM, Castiglioni I, Ceccarelli M, Bontempi G, Noushmehr H. 2016. “TCGAbiolinks: An R/Bioconductor Package for Integrative Analysis of TCGA Data.” doi:10.1093/nar/gkv1507.

Davis CF, Ricketts CJ, Wang M, Yang L, Cherniack AD, Shen H, Buhay C, Kang H, Kim SC, Fahey CC, et al. 2014. “The Somatic Genomic Landscape of Chromophobe Renal Cell Carcinoma.” doi:10.1016/j.ccr.2014.07.014.

Linehan WM, Spellman PT, Ricketts CJ, Creighton CJ, Fei SS, Davis C, Wheeler DA, Murray BA, Schmidt L, Vocke CD, et al. 2016. “Comprehensive Molecular Characterization of Papillary Renal-Cell Carcinoma.” doi:10.1056/NEJMoa1505917.

Silva TC, Colaprico A, Olsen C, D’Angelo F, Bontempi G, Ceccarelli M, Noushmehr H. 2016. “TCGA Workflow: Analyze Cancer Genomics and Epigenomics Data Using Bioconductor Packages.” doi:10.12688/f1000research.8923.1.